12 research outputs found

    QAW: A Quality Assurance Workflow for Ontologies based on Detecting Semantic Regularities - Dataset on SNOMED

    This page contains supplementary material for the EKAW 2014 submission titled "QAW: A Quality Assurance Workflow for Ontologies based on Detecting Semantic Regularities" by Eleni Mikroyannidi, Manuel Quesada-Martínez, Dmitry Tsarkov, Jesualdo Tomas Fernandez Breis, Robert Stevens, and Ignazio Palmisano.

    The detection of regularities was done with the Regularity Inspector for Ontologies (RIO) framework. The project is open source and can be downloaded from the links below.

    The fileset contains the data for the qualitative and quantitative analyses presented in the paper.

    In the qualitative analysis, six lexical patterns (keywords) were processed: "chronic", "acute", "absent", "present", "right", and "left". For each keyword, the reader can browse and download the following data:

    1. XML files with the generic name keyword_syntactic_usage.xml, which contain the detected syntactic regularities for the referencing asserted axioms of the entities that contain the corresponding keyword in their label. There should be 6 files in total (one for each keyword).
    2. XML files with the generic name keyword_semantic_usage.xml, which contain the detected semantic regularities for the referencing asserted axioms of the entities that contain the keyword in their label.
    3. Text files with the generic name keyword_syntactic_usage_readable.txt, which contain a more readable format with label rendering of the syntactic regularities.
    4. Text files with the generic name keyword_semantic_usage_readable.txt, which contain a more readable format with label rendering of the semantic regularities.

    In the quantitative analysis, 308 lexical patterns were processed, and the corresponding syntactic and semantic regularities were detected. The dataset available to the reader contains the following:

    1. LexAnal_Snomed_2013_NoSensitiveAnalysis_Cov_0.1_100.0.xml, which contains all lexical patterns that could be detected in the January 2013 version of SNOMED-CT.
    2. Snomed_2013_LexAnal_Full_0.1-0.4Perc_.xml, which contains all lexical patterns with a 0.1%-0.4% lexical pattern threshold.
    3. syntactic_regularities_dataset.zip, which contains 308 XML files with the syntactic regularities that were generated by RIO.
    4. semantic_regularities_dataset.zip, which contains 308 XML files with the semantic regularities that were generated by RIO.
    5. quantitative_syntactic_regularity_analysis.csv, which contains the statistical analysis of the syntactic regularities for the 308 processed cases.
    6. quantitative_semantic_regularity_analysis.csv, which contains the statistical analysis of the semantic regularities for the 308 processed cases.

    Tradeoffs in Measuring Entity Similarity for Pattern Detection in OWL Ontologies

    Abstract. Syntactic regularities are repetitive structures in the asserted axioms of an ontology represented as generalisations, which are axioms with variables. The Regularity Inspector for Ontologies (RIO) is a framework for detecting such regularities in ontologies. Established clustering techniques are applied to the signature of the ontology to detect clusters of similar entities. Clustering depends on pairwise entity distances, which determine the similarity of two entities. In this paper we present three variations on similarity definition that affect pairwise distances and thus the regularities detected. Our analysis explores and compares methods that capture regularities of different granularity; in particular we analyse commonalities and differences between the generalisations and clusters that result from the three variations of similarity and check if they capture dominant patterns in the ontology in the same way. We perform the analysis using the BioPortal corpus and we discuss the tradeoffs of each similarity function.

    Identifying ontology design styles with metrics

    Abstract. This paper presents a metrics framework for identifying high-level ontology composition styles as an abstraction layer for ontology comprehension. The metrics are implemented on top of the OWL API metrics, which have been extended to capture various aspects of an ontology. The metrics are applied to an ontology repository, and clustering methods are used to reveal the underlying design styles. We seek to give a high-level view of how an ontology has been "put together" through the identification of authoring style using a range of ontology metrics. This 'ontological style' can be considered as a metamodel, giving an intuition of what lies "under the hood" of an ontology. This can be useful in the broader area of ontology comprehension. The results highlighted five design styles covered by the repository, describing simple as well as rich and complex ontologies.

    Analysing Syntactic Regularities and Irregularities in SNOMED-CT

    Motivation: In this paper we demonstrate the usage of RIO, a framework for detecting syntactic regularities using cluster analysis of the entities in the signature of an ontology. Quality assurance in ontologies is vital for their use in real applications, but it is also a complex and difficult task. It is also important to have such methods and tools when the ontology lacks documentation and the user cannot consult the ontology developers to understand its construction. One aspect of quality assurance is checking how well an ontology complies with established ‘coding standards’: is the ontology regular in how descriptions of different types of entities are axiomatised? Is there a similar way to describe them, and are there any corner cases that are not covered by a pattern? Detection of regularities and irregularities in axiom patterns should provide ontology authors and quality inspectors with a level of abstraction such that checking compliance with coding standards can be automated. However, there is a lack of such reverse ontology engineering methods and tools. Results: The RIO framework allows regularities, i.e. repetitive structures in the axioms of an ontology, to be detected in an OWL ontology. We describe the use of standard machine learning approaches to make clusters of similar entities and generalise over their axioms to find regularities. This abstraction allows matches to, and deviations from, an ontology’s patterns to be shown. We demonstrate its usage with the inspection of three modules from SNOMED-CT, a larg

    Analysing syntactic regularities in ontologies

    Abstract. Syntactic regularities are repetitive structures of axioms in the asserted form of an ontology. The Regularity Inspector for Ontologies (RIO) is a framework for detecting such regularities in ontologies using cluster analysis. Detection of syntactic regularities can be used to identify parts of an ontology that have a similar syntactic structure, and could therefore provide an intuition of their construction. In this paper, we introduce uniformity in regularities, meaning the degree of diversity of regularities in an ontology. Based on this notion, we present an analysis of syntactic regularities in a variety of ontologies by applying RIO. The selected ontologies are mainly biomedical: processable BioPortal ontologies and SKOS vocabularies that represent biomedical concepts, gathered from the Web. Our analysis aims to show how syntactic regularities are formulated when a different knowledge representation language (OWL, SKOS) is used. The results show that the selected SKOS vocabularies were more uniform in terms of their syntactic regularities: smaller homogeneous clusters were found, with few generalisations, but of a high abstraction level and cluster coverage. Compared to SKOS vocabularies, BioPortal ontologies were regular, but more complex and less uniform. The analysis of syntactic regularities and uniformity of regularities can be helpful for gaining an intuition of the ontology design and its complexity.